Abstract


Coronaviruses have exceptionally large RNA genomes of approximately 30 kilobases. 
Genome replication and transcription are mediated by a multisubunit protein complex
composed of more than a dozen virus-encoded proteins. The protein complex is
thought to bind specific cis-acting RNA elements primarily located in the 5’- and 3’-terminal
genome regions and upstream of the open reading frames located in the 3’-proximal
one-third of the genome. Here, we review our current understanding of
coronavirus cis-acting RNA elements, focusing on elements required for genome replication
and packaging. Recent bioinformatic, biochemical, and genetic studies suggest
a previously unknown level of conservation of cis-acting RNA structures among different 
coronavirus genera and, in some cases, even beyond genus boundaries. Also, there is 
increasing evidence to suggest that individual cis-acting elements may be part of 
higher-order RNA structures involving long-range and dynamic RNA-RNA interactions 
between RNA structural elements separated by thousands of nucleotides in the viral 
genome. We discuss the structural and functional features of these cis-acting RNA ele- 
ments and their specific functions in coronavirus RNA synthesis. 
1. INTRODUCTION


Coronaviruses are enveloped, positive-strand RNA viruses. They 
have been united in the subfamily Coronavirinae within the family 
Coronaviridae (de Groot et al., 2012a; Masters and Perlman, 2013). Together 
with three other families (Arteriviridae, Roniviridae, and Mesoniviridae), the
Coronaviridae form the order Nidovirales (de Groot et al., 2012b). According 
to the current classification, the subfamily Coronavirinae comprises four genera
called Alpha-, Beta-, Gamma-, and Deltacoronavirus. In some cases, these gen- 
era have been further subdivided into lineages. Coronaviruses infect a wide 
range of mammals and birds and include pathogens of major medical, vet- 
erinary, and economic interest (de Groot et al., 2012a; Fehr and Perlman, 
2015; Masters and Perlman, 2013), with severe acute respiratory syndrome 
(SARS) coronavirus (SARS-CoV) and Middle East respiratory syndrome
(MERS) coronavirus (MERS-CoV) providing two prominent examples 
of zoonotic coronaviruses causing severe respiratory disease in humans 
(Drosten et al., 2003; Ksiazek et al., 2003; Vijay and Perlman, 2016; Zaki 
et al., 2012; Zumla et al., 2015). 

Among plus-strand RNA viruses, coronaviruses and related nidoviruses 
stand out for their large genome size of about 30 kilobases (kb), the synthesis
of numerous subgenomic mRNAs, and the large number of nonstructural 
proteins (nsps) involved in viral RNA synthesis and interactions with host 
cell functions (reviewed in Masters and Perlman, 2013; Ziebuhr, 2008). 
Most of the nsps are encoded by the viral replicase gene that occupies the 
5’-terminal two-thirds of the genome and comprises two large open
reading frames, ORF1a and ORF1b. Translation of ORF1a yields polyprotein
(pp) 1a (~450 kDa). Translation of ORF1b requires a programmed
ribosomal frameshift event (Brierley et al., 1987, 1989) that occurs just
upstream of the ORF1a stop codon and results in pp1ab (~750 kDa).
Co- and posttranslational cleavage of pp1a/1ab by two types of virus-encoded
proteases associated with nsp3 and nsp5 (Mielech et al., 2014;
Ziebuhr et al., 2000) gives rise to a total of 15-16 mature proteins that form
the viral replication-transcription complex (RTC), which is thought to also
involve the nucleocapsid protein and several cellular proteins (Almazan 
et al., 2004; Schelle et al., 2005; Ziebuhr, 2008; Ziebuhr et al., 2000). This 
multiprotein complex replicates the viral genome and produces an extensive 
set of 3’-coterminal subgenomic messenger RNAs (sg mRNAs), the latter 
representing a hallmark of corona- and other nidoviruses (Pasternak et al., 
2006; Sawicki et al., 2007; Ziebuhr and Snijder, 2007). The sg mRNAs are
used to express the genes located downstream of the replicase gene, which
encode the viral structural proteins (nucleocapsid (N), membrane (M), spike (S),
and envelope (E) proteins) and several accessory proteins that, in many cases,
have been implicated in functions that interfere with antiviral host responses 
(Liu et al., 2014; Masters and Perlman, 2013; Narayanan et al., 2008b). 

In this chapter, we will briefly summarize coronavirus RNA synthesis 
and then discuss the structural and functional features of currently known 
cis-acting RNA elements located in the 5'- and 3'-terminal untranslated
regions (UTRs) and neighboring coding regions. Also, we will review the
current knowledge of signals required for packaging and of cellular proteins 
presumed to be involved in viral RNA synthesis. 


2. CORONAVIRUS GENOME REPLICATION AND 
TRANSCRIPTION 


Following receptor-mediated entry into the host cell, the viral genome 
RNA, which is 5’-capped and 3’-polyadenylated, is released from the nucle- 
ocapsid and used for translation of the 5’-terminal ORFs 1a and 1b to produce 
the key components of the viral RTC. The complex is anchored by 
membrane-spanning domains (residing in nsp3, 4, and 6) to virus-induced 
membranous structures that provide a scaffold for the protein machinery 
involved in viral RNA synthesis (den Boon and Ahlquist, 2010; Gosert 
et al., 2002; Kanjanahaluethai et al., 2007; Knoops et al., 2008; Oostra 
et al., 2007, 2008; Snijder et al., 2006; van Hemert et al., 2008). Over the past 
years, a wealth of information has been obtained on enzymatic and other 
functions, three-dimensional structures and interactions of individual nsps 
produced from pp1a and pp1ab (reviewed in Imbert et al., 2010; Masters,
2006; Ulferts et al., 2010; Ziebuhr, 2008). The studies show that, in addition 
to common enzymes conserved in most +RNA viruses, such as RNA- 
dependent RNA polymerase (RdRp) (te Velthuis et al., 2010), helicase/ 
NTPase (Seybert et al., 2000), proteases (Baker et al., 1989; Ziebuhr et al.,
1995), and 5’ cap-specific methylases (Chen et al., 2009b; Decroly et al., 2008,
2011), coronaviruses encode an extra set of proteins in their replicase genes. 
These additional (sometimes even unique) enzymatic functions include a 
3'-5' exoribonuclease (Minskaia et al., 2006; Snijder et al., 2003) that is 
thought to be involved in mechanisms required for high-fidelity replication 
of nidovirus (including coronavirus) genomes of more than 20 kb (Eckerle 
et al., 2010; Minskaia et al., 2006; Smith et al., 2013, 2014) and a 
uridylate-specific endoribonuclease of currently unknown function that was
found to be conserved in all vertebrate nidoviruses (Ivanov et al., 2004; Nga 
et al., 2011; Ulferts and Ziebuhr, 2011). In some cases, the replicase gene- 
encoded enzymes could be linked to specific steps of viral RNA synthesis 
and/or RNA processing or were shown to interfere with cellular functions 
(reviewed in Fehr and Perlman, 2015; Masters and Perlman, 2013; 
Ziebuhr, 2008). Interactions between different nsps have been predicted 
and characterized for a large number of proteins and the structural basis 
and possible functional implications of these interactions have been a major
topic of research. For example, it has been shown that the exoribonuclease
and ribose 2’-O-methyltransferase activities associated with nsp14 and
nsp16, respectively, are stimulated by nsp10 and the interacting surfaces have 
been identified by mutagenesis and structural studies (Bouvet et al., 2014; 
Decroly et al., 2011; Ma et al., 2015). Also, there is evidence that a 
hexadecameric complex formed by eight molecules of nsp7 and eight mole- 
cules of nsp8 assists the RdRp by acting as a processivity factor (Subissi et al., 
2014; Zhai et al., 2005). Additional interactions between individual subunits 
of the RTC have been suggested on the basis of two-hybrid screening data 
(Pan et al., 2008; von Brunn et al., 2007) and there is evidence that a large 
number of coronavirus nsps assemble to form homo- or hetero-oligomeric
complexes (Anand et al., 2002, 2003; Bouvet et al., 2014; Chen et al., 
2011; Ma et al., 2015; Ricagno et al., 2006; Su et al., 2006; Xiao et al., 
2012; Zhai et al., 2005). 

Coronaviruses produce a set of 5’- and 3’-coterminal sg mRNAs
that contain a common 5’-leader sequence of about 60-95 nt (Spaan
et al., 1983). The sequence of this leader is identical to the 5’-terminal
sequence of the viral genome RNA. Synthesis of coronavirus sg mRNAs
is thought to involve a “discontinuous” step during negative-strand RNA 
synthesis (Sawicki and Sawicki, 1995). Specific proteins of the RTC that 
are required for (or involved in) this discontinuous extension step remain 
to be identified while important cis-acting RNA elements, called 
“transcription-regulating sequences” (TRSs), that are required for this step 
have been characterized for a number of coronaviruses (reviewed in Sola 
et al., 2011b, 2015). TRSs are located downstream of the 5’-leader on the 
genome (“leader-TRS,” TRS-L) and upstream of each of the major 
ORFs present in the 3’-proximal genome region (“body-TRSs,” TRS-B).
They play a vital role in supporting the transfer of the nascent minus
strand from a distant position in the 3’-proximal genome region to the
TRS-L located near the 5’-end of the genome following attenuation of
minus-strand RNA synthesis at one of the TRS-Bs. Coronavirus TRSs
contain an AU-rich motif of about 10 nucleotides that is involved in base- 
pairing interactions between the TRS-L and the complement of a body- 
TRS (Sawicki and Sawicki, 1995, 1998; Sawicki et al., 2007; Sethna et al., 
1991). Following transfer of the nascent minus strand from its downstream 
position on the template (at the TRS-B) to the TRS-L close to the 5’ end 
of the genome, negative-strand RNA synthesis is resumed and completed 
by copying the 5’ leader sequence. The resulting set of 3’ antileader-containing
sg minus-strand RNAs is subsequently used as templates for
the production of the characteristic nested set of 5’ leader-containing 
mRNAs in coronavirus-infected cells (Lai et al., 1983; Sawicki and 
Sawicki, 1995; Sawicki et al., 2001; Sethna et al., 1989; Spaan et al., 
1983). Sg minus-strand RNAs contain a U-stretch at their 5’ end, provid- 
ing a possible template for 3’ polyadenylation of sg mRNAs (Hofmann 
and Brian, 1991; Wu et al., 2013). 
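
The leader-body joining described above can be illustrated with a small Python sketch. It is a deliberately simplified toy model, not the authors' method: the fusion is modelled directly on the plus strand (in the virus it is thought to occur during minus-strand synthesis), the genome sequence and the function name `sg_mrnas` are invented for illustration, and every downstream copy of the core sequence is assumed to act as a TRS-B.

```python
def sg_mrnas(genome: str, core: str) -> list[str]:
    """Return leader-containing sg mRNAs for a toy plus-strand genome (5'->3')."""
    first = genome.find(core)            # first core sequence is taken as the TRS-L
    if first < 0:
        return []
    leader = genome[:first]              # 5' leader upstream of the TRS-L
    mrnas = []
    pos = genome.find(core, first + 1)   # every later copy is treated as a TRS-B
    while pos >= 0:
        # Leader fused to the body at the shared core sequence;
        # the body always extends to the genomic 3' end (3'-coterminal).
        mrnas.append(leader + genome[pos:])
        pos = genome.find(core, pos + 1)
    return mrnas


# Hypothetical toy genome: leader, TRS-L, one 5' ORF, then two TRS-B/ORF units.
LEADER = "GAUUGCGUCC"
CS = "ACUAAAC"  # core sequence quoted later in the text for TGEV
GENOME = (LEADER + CS + "GAAAUGCCCGGGUAG"
          + CS + "AUGUUUUGA"
          + CS + "AUGGGGUAA" + "AAAA")

for rna in sg_mrnas(GENOME, CS):
    print(len(rna), rna)
```

The output is a nested set: every RNA starts with the same leader and ends at the same 3' end, and each successive RNA is shorter than the previous one.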

As mentioned earlier, the cis-acting RNA elements required for coronavirus
replication (and transcription) are located in the 5’- and 3’-terminal
genome regions and largely (but not exclusively) encompass noncoding 
regions (Chang et al., 1994; Dalton et al., 2001; Izeta et al., 1999; Kim 
et al., 1993; Liao and Lai, 1994; Lin et al., 1994, 1996; Zhang et al., 
1994). Additional cis-acting elements are located at internal positions and 
include the TRS elements involved in transcription as well as specific 
RNA signals required for genome packaging (Chen et al., 2007; Escors 
et al., 2003; Makino et al., 1990; Morales et al., 2013; Penzes et al., 
1994). Another important RNA structural element is located in the 
ORF1a-ORF1b overlap region. This complex pseudoknot structure mediates
a (-1) ribosomal frameshift event and thus controls the expression of the
second large ORF on the coronavirus genome RNA (ORF1b) (Brierley
et al., 1987, 1989; de Haan et al., 2002; Namy et al., 2006). 
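
To make the consequence of a -1 frameshift concrete, the following Python sketch translates a toy RNA with and without a slip. Several things here are assumptions rather than statements from the text: the heptanucleotide UUUAAAC is used as the slippery site (it is the sequence commonly reported for coronaviruses, but it is not given above), the RNA itself is invented, the stimulatory pseudoknot is not modelled, and the shift is treated as all-or-none rather than as occurring at a certain efficiency.

```python
from itertools import product

# Standard genetic code, built in UCAG order ('*' marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(rna: str, start: int = 0) -> str:
    """Translate from `start` until the first stop codon or the end of the RNA."""
    protein = []
    for i in range(start, len(rna) - 2, 3):
        aa = CODON[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def frameshift_product(rna: str, slippery: str = "UUUAAAC") -> str:
    """Translate in frame 0 through the slippery site, then continue in the -1 frame.

    Assumes translation starts at index 0, that the slippery site is positioned
    as X_XXY_YYZ relative to frame 0, and that no stop codon precedes it.
    """
    p = rna.find(slippery)
    if p < 0:
        return translate(rna)
    slip_end = p + len(slippery)
    # The last base of the heptamer is read twice: once as the third base of
    # the frame-0 codon and again as the first base of the next -1 codon.
    return translate(rna[:slip_end]) + translate(rna, slip_end - 1)


# Hypothetical toy RNA: AUG GCU UUA AAC GGG UAA ... (frame 0 stops at UAA).
TOY = "AUGGCUUUAAACGGGUAAGCUGACCC"
print(translate(TOY))            # frame-0 product, terminated at the frame-0 stop
print(frameshift_product(TOY))   # fusion product that bypasses the frame-0 stop
```

In this toy case the frame-0 product ends at the first stop codon, whereas the frameshifted product shares its N-terminal part with the frame-0 product and then continues in the new frame, which is the relationship between pp1a and pp1ab in miniature.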


3. CORONAVIRUS cis-ACTING RNA ELEMENTS 


Historically, cis-acting RNA elements essential for coronavirus RNA 
synthesis have mainly been characterized using naturally occurring and 
genetically engineered defective interfering RNAs (DI RNAs) (reviewed 
in Brian and Baric, 2005; Brian and Spaan, 1997; Masters, 2007; Sola 
et al., 2011b). DI RNAs are relatively short RNAs that are derived from viral
genome RNA but lack large (internal) sequence parts. DI RNAs are repli- 
cated in cells provided that a suitable (i.e., closely related) helper virus 
provides functional replicase complexes in trans (Levis et al., 1986; Weiss
et al., 1983) and that the DI RNA contains all the cis-acting RNA signals 
required for replication. In general, DI RNAs contain the entire 5'- and 
3'-untranslated genome regions and, in most cases, also small parts of neigh- 
boring (or internal) coding regions (Lin and Lai, 1993). Coronavirus DI 
RNAs were first reported and most extensively studied for the 
betacoronaviruses MHV and BCoV (Chang et al., 1994; de Groot et al., 
1992; Hofmann et al., 1990; Luytjes et al., 1996; Makino et al., 1984, 
1985, 1988a,b). Subsequently, DI RNAs were also identified and character- 
ized in alpha- and gammacoronaviruses (Izeta et al., 1999; Mendez et al., 
1996; Penzes et al., 1994, 1996). 

Identification and characterization of DI RNAs in various coronaviruses 
have been instrumental in mapping the minimal RNA sequences and struc- 
tures required for replication and packaging. A major problem in studies 
using DI RNAs for defining elements required for replication was the high- 
frequency homologous recombination between the RNA replicon and the 
helper virus genome. For example, BCoV-derived artificial DI RNAs con- 
taining base substitutions within 5’ leader sequences rapidly acquired the 
leader sequence of the helper virus (Chang et al., 1994, 1996; Makino 
and Lai, 1990). This “leader switching” was regularly observed in serial pas- 
saging experiments aimed to rescue (or amplify) DI RNAs for further phe- 
notypic characterization. With the development of a range of coronavirus 
reverse genetic systems, the manipulation of full-length coronavirus cDNA 
copies for functional characterization of cis-acting RNA elements at the 
genome level (including long-range RNA-RNA interactions) has now 
become an attractive alternative to overcome some of the limitations of 
the DI RNA-based systems used previously (Almazan et al., 2000; Casais 
et al., 2001; Scobey et al., 2013; Tekes et al., 2008; Thiel et al., 2001; 
van den Worm et al., 2012; Yount et al., 2000, 2003). 


3.1 5’-Terminal cis-Acting RNA Elements


DI RNA-based studies performed with representative betacoronaviruses 
(MHV and BCoV) revealed that approximately 500 nt from the genomic 
5’ end (467 nt in MHV and 498 nt in BCoV) are required for replication 
(Chang et al., 1994; Kim et al., 1993; Luytjes et al., 1996). Similar 5’-terminal 
sequence requirements were established in subsequent studies for the 
alphacoronavirus TGEV (649 nt) (Escors et al., 2003) and the
gammacoronavirus IBV (544 nt) (Dalton et al., 2001). These DI RNAs 
contained the entire 5’ UTR, ranging in size from 210 nt (MHV, BCoV, and
HCoV-OC43) to 314 nt (TGEV), and a part of the replicase gene (from the 
nsp1-coding region) (see later). In contrast to alpha- and betacoronaviruses, 
the gammacoronavirus IBV features a larger 5’ UTR (528 nt) (Boursnell 
et al., 1987) and lacks an equivalent of nsp1 (Ziebuhr et al., 2001). In this case,
the 5’ UTR alone appears to contain all the signals required for genome 
replication. 


3.1.1 Structural Features of Coronavirus 5'-Terminal cis-Acting 
Elements 

The majority of the 5’-proximal RNA structures and sequences essential for 
coronavirus genome replication have first been characterized for BCoV 
using DI RNA-based systems (Brown et al., 2007; Chang et al., 1994, 
1996; Gustin et al., 2009; Raman and Brian, 2005; Raman et al., 2003). 
The 5’-proximal 215 nt of the BCoV genome were predicted to harbor
four stem-loops (SLs) that, in the older literature, were termed SL 
I (comprised of Ia and Ib), II, III, and IV. The structures were identified 
by in vitro structure probing analysis of appropriate DI RNAs and their 
cis-acting functions were investigated by DI RNA replication studies and 
mutation analysis. More recently, two additional SLs called SL-V and SL-VI 
were identified in the BCoV nsp1-coding region, with SL-VI being essential 
for DI RNA replication (Brown et al., 2007). 

Unlike BCoV, MHV is predicted to contain three conserved SLs, SL1, 
SL2, and SL4, in this 5’-terminal genome region (Fig. 1). Using 5’-terminal 
genome sequences of about 140 nts of nine coronaviruses, including five 
betacoronaviruses (BCoV, human coronavirus (HCoV) OC43, HCoV- 
HKU1, SARS-CoV, and MHV-A59), three alphacoronaviruses (HCoV- 
NL63, HCoV-229E, and TGEV), and one gammacoronavirus (IBV), the 
Leibowitz and Giedroc laboratories proposed a consensus 5’-terminal
RNA secondary structure model (Kang et al., 2006; Liu et al., 2007) that 
includes three highly conserved hairpin structures, SL1, SL2, and SL4. This 
model was confirmed and extended by genus-wide alignment-based sec- 
ondary structure predictions using LocARNA (Madhugiri et al., 2014; 
Smith et al., 2010; Will et al., 2007, 2012) in which, despite profound 
sequence diversity in this genome region, three highly conserved SLs
(SL1, SL2, and SL4) were identified in the 5’-terminal 150-nt betacoronavirus
genome regions (Madhugiri et al., 2014) (Fig. 1). 

Fig. 1 Conserved cis-acting RNA elements in the 5’- and 3’-proximal genome regions
of coronaviruses. Shown is the coronavirus genome organization with the two large
5’ ORFs, 1a and 1b, that together constitute the replicase gene, while details of structural
and accessory protein ORFs are not shown. Black circles at the RNA 5’ ends indicate the
5’ cap structure, while (A)n indicates the 3’ poly(A) tail. The -1 ribosomal frameshift signal
(RFS) at the ORF1a/1b junction site is indicated by an asterisk. S, S gene; N, N gene.
Approximate positions of the packaging signals (PS) determined for MHV and TGEV are
indicated by arrows. (A) Schematic representation of RNA structural elements in the
5’-terminal genome regions of MHV, BCoV, and HCoV-229E. Filled boxes indicate the
leader-TRS (TRS-L). Boxes in light gray indicate the start codons of the uORF(s) located
upstream of ORF 1a. Boxes in dark gray indicate the position of the ORF1a start codon.
(B) Schematic representation of RNA structural elements in the 3’-terminal genome
regions of MHV, BCoV, and HCoV-229E. Major conserved RNA structural elements are
shown, together with base-pairing interactions required to form a pseudoknot (PK)
structure. Also shown is the position of a highly conserved octanucleotide sequence
that is located in a single-stranded region. BSL, bulged stem-loop; L, loop; S, stem; SL,
stem-loop structure; HVR, hypervariable region; PK, pseudoknot.

Interestingly, the BCoV and SARS-CoV genome RNAs were predicted
to accommodate an additional SL (called SL3) in the region between SL2
and SL4. SL3 is predicted to adopt a stable hairpin structure containing the
TRS-L (Fig. 1). The formation of an equivalent SL3 structure can also be 
forced for MHV and several other betacoronaviruses (Chen and 
Olsthoorn, 2010; Madhugiri et al., 2014), although this structural element 
would only contain two conserved base pairs and was predicted to be unstable
at 37°C (Liu et al., 2007). In a recent study, we extended these analyses
and used multiple alignments calculated with LocARNA (Madhugiri et al.,
2014; Smith et al., 2010; Will et al., 2007, 2012) to identify RNA
structural elements conserved in the 5’-proximal genome regions of
alphacoronaviruses (Madhugiri et al., 2014). The predicted structures were 
verified and refined by RNA structure probing analyses (Ehresmann et al., 
1987; Qu et al., 1983) using in vitro-transcribed RNAs with sequences 
corresponding to the 5’-terminal genome regions of HCoV-229E and
HCoV-NL63, respectively. The combined structural and phylogenetic analyses
performed in different laboratories produce a rather coherent picture,
with SL1, SL2, and SL4 representing cis-acting RNA elements that are
highly conserved across different coronavirus genera despite pronounced
sequence diversity in the respective 5’-terminal genome regions (Chen
and Olsthoorn, 2010; Kang et al., 2006; Liu et al., 2007; Madhugiri 
et al., 2014). 

To further confirm the previously identified conserved betacoronavirus 
5'-proximal RNA secondary structures, a recent study used a selective 
2'-hydroxyl acylation and primer extension (SHAPE) methodology to 
determine the secondary structure of the 5’-terminal 474-nt region of the
MHV-A59 genome RNA in the virus (in virio), after gentle extraction
and deproteinization (ex virio), and as an in vitro-transcribed RNA (Yang
et al., 2015). With very few exceptions, the RNA secondary structures 
determined in this study essentially confirmed the previously characterized 
or predicted SL1, SL2, and SL4 structures (Fig. 1) (Li et al., 2008; Liu et al., 
2007, 2009; Yang et al., 2011). The SHAPE analyses also confirmed that the 
(weak) TRS-L-containing SL3 hairpin predicted previously by phyloge- 
netic algorithms (Chen and Olsthoorn, 2010) is part of a single-stranded 
region, consistent with previous predictions that this region is weakly paired
or unpaired (Liu et al., 2007; Madhugiri et al., 2014). Also several other 
RNA secondary structures identified by SHAPE analysis corresponded very 
well to the previous models of MHV-A59 RNA secondary structures pro- 
posed by Brian and coworkers (Guan et al., 2011, 2012; Yang et al., 2015). 
Furthermore, the study provides biochemical support for the presence of 
additional hairpin structures in the MHV 5’-terminal genome region,
including SL5a (designated earlier as SL-IV), SL5b, SL5c, SL6, and SL7.
An Alphacoronavirus genus-wide bioinformatics study revealed a very well 
conserved higher-order RNA structure (comprising 5a, 5b, and 5c) in 
an equivalent genome region (Madhugiri et al., 2014). The predicted 
SL5a, b, and c structures were confirmed and refined by in vitro RNA structure
probing information obtained for the 5’-terminal 600 nt of HCoV-229E
and HCoV-NL63 (Madhugiri et al., 2014; unpublished data). Also,
the study identified significant constraints in the alphacoronavirus SL5 as 
judged by the large number of covariant base pairs, suggesting an important 
function in alphacoronavirus RNA synthesis, possibly related to that 
described for the betacoronavirus MHV-A59 SL-IV (=SL5a) in supporting 
efficient viral replication. Furthermore, SL5 was suggested to be involved in 
long-range RNA-RNA interactions (Guan et al., 2012), which was 
found to be in good agreement with the SHAPE analysis data (Yang 
et al., 2015). 

Downstream of SL5, additional SL structures (SL6, 7, and 8) were iden- 
tified. The available evidence suggests that these structures are less well con- 
served among MHV, BCoV, and SARS-CoV and probably play a less 
important role in viral replication (Brockway and Denison, 2005; Yang 
et al., 2015). 

Taken together, the available information suggests a model in which the 
5’-terminal ~320-nt genome regions of both alpha- and betacoronaviruses 
contain four major RNA structural elements called SL1, SL2, SL4, and SL5 
(Chen and Olsthoorn, 2010; Kang et al., 2006; Liu et al., 2007; Madhugiri 
et al., 2014; Yang et al., 2015) (see Fig. 1). The conservation of the SL1, SL2,
SL4, and SL5abc RNA structural elements (despite pronounced nucleotide
sequence divergence) suggests important functions for these structures in the
coronavirus life cycle. Functional features of individual structural elements 
will be discussed later in more detail. 


3.1.2 Functional Roles of Coronavirus 5'-Terminal cis-Acting Elements 
In contrast to the growing body of information on structures and their conservation
in the coronavirus 5’-terminal genome region across all genera of
coronaviruses, the functional significance of the individual SL structures has 
almost exclusively been studied for two (closely related) betacoronaviruses, 
MHV and BCoV. The structural and functional conservation inferred from 
these studies for 5’-terminal betacoronavirus cis-acting elements was sub- 
stantiated by reverse genetic data demonstrating that SARS-CoV SL1, 
SL2, and SL4 can functionally replace their counterparts in the MHV 
genome when introduced individually (Kang et al., 2006). Unlike the individual
hairpin structure substitutions, replacement of the entire MHV 5’
UTR with that of SARS-CoV did not yield a viable MHV mutant, possibly 
indicating a requirement for stable or transient long-range RNA-RNA 
interactions of the 5’ UTR with other genome regions. Evidence to support 
this hypothesis was obtained in subsequent studies. For example, the ener- 
getically unstable lower part of MHV SL1 was found to be involved in long- 
range RNA interactions with the 3’ UTR (Li et al., 2008) (see later). Similar 
to the SARS-CoV data mentioned earlier, each of the four BCoV 5’-termi- 
nal SLs, SL1, SL2, SL4, and SL5a, was shown to functionally replace its 
MHV counterpart, yielding chimeric viruses with near-wild-type replication
kinetics. Furthermore, using MHV/BCoV chimeras, a region downstream
of SL5 was revealed to be engaged in long-range interactions with
the nsp1-coding region, possibly forming an extensive higher-order
RNA structure (Guan et al., 2012). In addition, a mutagenesis study using
BCoV DI RNA (Su et al., 2014) indicated that this multipartite RNA struc- 
ture may involve several SL substructures identified in earlier studies (Gustin 
et al., 2009; Raman and Brian, 2005) but require refolding of other RNA 
structures suggested earlier to be essential for DI RNA replication (Brown 
et al., 2007). A recent study (Su et al., 2014) provided evidence that a short 
oligopeptide from the N-terminal domain of nsp1 may be an essential cis- 
acting protein factor involved in betacoronavirus replication, thus adding 
to the multiple other functions of this protein (Brockway and Denison, 
2005; Huang et al., 2011a,b; Kamitani et al., 2006, 2009; Lei et al., 2013; 
Lokugamage et al., 2012; Narayanan et al., 2008a; Tanaka et al., 2012; 
Tohya et al., 2009; Wathelet et al., 2007; Zust et al., 2007). 


3.1.2.1 Stem-Loops 1 and 2 

The 5’-proximal SL1 and SL2 are predicted to be conserved across all genera
of the Coronavirinae (Chen and Olsthoorn, 2010; Liu et al., 2007). Nuclear 
magnetic resonance spectroscopy studies of MHV and HCoV-OC43 SL1 
RNAs revealed a functionally and structurally bipartite structure for this 
SL (Li et al., 2008). SL1 was proposed to exist in an equilibrium with higher- 
energy (partially unfolded) conformers. Characterization of MHV mutants 
containing specific replacements in SL1 and sequence analysis of second-site 
revertants support a “dynamic SL1” model in which the lower part of SL1 is 
required to have an optimally balanced stability/lability. The structural 
destabilization of the upper part of SL1 by disrupting specific base-pair inter- 
actions proved to be lethal or resulted in viruses with replication defects, 
while compensatory mutations that restored the base pairing in the upper
part of SL1 restored viral replication to near-wild-type levels, suggesting that 
efficient virus replication requires this part of SL1 to be base-paired. In con- 
trast, disruption of the basal part of SL1 was largely tolerated while compen- 
satory mutations that restored base pairing in the lower part proved to be 
lethal, suggesting a prominent role for RNA sequence rather than structure 
conservation in the lower part of SL1. Interestingly, the study also identified 
a possible link between SL1 and minus-strand subgenomic RNA synthesis 
(Li et al., 2008). The combined data presented in this study suggest that SL1 
requires an optimized stability suitable to establish or fine-tune transient 
long-range (RNA- or protein-mediated) interactions between the 5’ and 
3’ UTRs that may be required for sg RNA transcription and genome replication.
This hypothesis is also supported by deletion mutagenesis studies in
which viable second-site (pseudo)revertants acquired other destabilizing 
mutations, most likely, to keep the stability of this structure below a certain 
threshold. Finally, several viable viruses were revealed to contain mutations 
in the 3’-UTR, providing genetic evidence for interactions between the 5’-
and 3’-UTRs.

SL2 is the most conserved structure in the coronavirus 5’ UTR 
(Chen and Olsthoorn, 2010; Liu et al., 2007). It is comprised of a 5-bp 
stem and a highly conserved loop sequence, 5’-CUUGY-3’, that was
shown to adopt a 5’-uYNMG(U)a- or 5’-uCUYG(U)a-like tetraloop
structure (Lee et al., 2011; Liu et al., 2009). Reverse genetics data con- 
firmed that SL2 is required for MHV replication and, possibly, sg mRNA 
synthesis. Within certain structural constraints, nucleotide replacements 
were found to be tolerated or could be rescued by increasing the stem sta- 
bility, suggesting a limited plasticity of this conserved cis-acting RNA ele- 
ment (Liu et al., 2009). 
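
As a rough illustration of the SL2 architecture described above (a 5-bp stem closing a CUUGY loop), the following Python sketch scans a sequence for hairpins of that shape. It is only a pattern search under stated assumptions: G-U wobble pairs are accepted in the stem, no thermodynamic folding is performed, and the function name `find_sl2_like` and both test sequences are invented for illustration.

```python
import re

# Watson-Crick pairs plus G-U wobble (allowing wobble is an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
LOOP_LEN = 5  # length of the CUUGY loop


def find_sl2_like(rna: str, stem: int = 5) -> list[tuple[int, str, str, str]]:
    """Return (start, 5' stem, loop, 3' stem) for each CUUGY loop closed by a full stem."""
    hits = []
    # Zero-width lookahead so that overlapping candidate loops are all examined.
    for m in re.finditer(r"(?=CUUG[CU])", rna):
        loop_start = m.start()
        if loop_start < stem:
            continue
        left = rna[loop_start - stem:loop_start]
        right = rna[loop_start + LOOP_LEN:loop_start + LOOP_LEN + stem]
        if len(right) < stem:
            continue
        # The 5' stem pairs with the 3' stem read in the opposite direction.
        if all((a, b) in PAIRS for a, b in zip(left, reversed(right))):
            hits.append((loop_start - stem, left, rna[loop_start:loop_start + LOOP_LEN], right))
    return hits


# Hypothetical toy sequences: one with an intact stem, one with a disrupted 3' stem.
intact = "AAC" + "GAUCU" + "CUUGU" + "AGAUC" + "UGU"
broken = "AAC" + "GAUCU" + "CUUGU" + "ACCUC" + "UGU"
print(find_sl2_like(intact))
print(find_sl2_like(broken))
```

A search of this kind can only flag candidates; whether a candidate actually folds into the tetraloop-like structure depends on context and would need structure probing or the kind of analysis cited above.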


3.1.2.2 Stem-Loop 3 

As mentioned earlier, SL3 (named SL-II in previous BCoV DI RNA studies) 
appears to be conserved in a small subset of beta- and gammacoronaviruses 
(Chen and Olsthoorn, 2010). For BCoV and SARS-CoV, the TRS-L core 
sequence (CS) has been predicted to be part of this SL3 hairpin loop, a struc- 
ture similar to the TRS-L hairpin structure reported for the related arterivirus 
equine arteritis virus (Chang et al., 1996; van den Born et al., 2004, 2005). In 
contrast to the situation in BCoV and SARS-CoV, the structure probing data 
obtained for MHV, HCoV-229E, and HCoV-NL63 suggest that the TRS-L 
CS and flanking regions are located in single-stranded regions (Chen and 
Olsthoorn, 2010; Madhugiri et al., 2014; Stirrups et al., 2000; Wang and
Zhang, 2000; Yang et al., 2015). 


3.1.2.3 Stem-Loop 4 

SL4 is a long hairpin structure located downstream of the TRS-L CS and has 
been suggested to be conserved across all coronavirus genera (Chen and 
Olsthoorn, 2010; Raman and Brian, 2005; Raman et al., 2003). Using a 
BCoV DI RNA system, a SL structure that was designated SLIII was 
mapped between nts 97 and 116 in the 5’-terminal genome region. The 
cis-acting function of SLIII was corroborated by studying effects of 
destabilizing mutations in this structural element (Raman et al., 2003). Sub- 
sequent studies by other laboratories confirmed these findings (Kang et al., 
2006; Liu et al., 2007). Genus-wide bioinformatics analyses revealed that 
SL4 is conserved in alpha- and betacoronaviruses (Madhugiri et al., 
2014). It is predicted to form a bipartite SL structure, comprised of 4a 
and 4b, the latter substructures being separated by a bulge (Kang et al., 
2006; Liu et al., 2007; Madhugiri et al., 2014). SL4b identified by various 
groups corresponds to the SLIII identified by Brian and coworkers (see ear- 
lier). Furthermore, SL4 was shown to contain a short ORF comprised of just 
a few codons. Because of its position in the genome, upstream of the large 
ORF1a, it is generally referred to as the uORF. Recent reverse genetics
work in the MHV system (Wu et al., 2014; Yang et al., 2011) showed that 
disruption of the uORF yields viable mutants that, however, evolve other 
uORFs upon serial passaging in cell culture. In vitro, uORF-disrupted 
RNAs showed enhanced translation of the downstream ORF, suggesting 
that the uORF represses ORF1a/1b translation and has a beneficial but non- 
essential role in coronavirus replication in cell culture. 
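
The notion of a uORF, and of disrupting it by mutating its start codon, can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions: the sequences, the function name `find_uorfs`, and the position of the main start codon are all invented, and the sketch only locates AUG-initiated reading frames upstream of the main ORF without modelling translation efficiency.

```python
STOPS = {"UAA", "UAG", "UGA"}


def find_uorfs(rna: str, main_start: int) -> list[tuple[int, int, int]]:
    """List uORFs as (start, end, sense codons) for AUGs lying fully upstream of main_start.

    `end` is the index just past the stop codon; AUGs without an in-frame
    stop codon in the supplied sequence are ignored.
    """
    uorfs = []
    for i in range(max(0, main_start - 2)):
        if rna[i:i + 3] != "AUG":
            continue
        for j in range(i + 3, len(rna) - 2, 3):
            if rna[j:j + 3] in STOPS:
                uorfs.append((i, j + 3, (j - i) // 3))
                break
    return uorfs


# Hypothetical 5' region: a two-codon uORF (AUG CCU UAG) upstream of the main AUG at index 16.
wild_type = "GGCAUGCCUUAGGACC" + "AUGGCUGCUUAA"
disrupted = "GGCACGCCUUAGGACC" + "AUGGCUGCUUAA"   # uORF start codon AUG -> ACG
print(find_uorfs(wild_type, 16))
print(find_uorfs(disrupted, 16))
```

Applying such a scan to the sequences of passaged mutants would be one simple way to look for newly arisen upstream start codons of the kind reported in the studies cited above.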

Even though the 5’-terminal SL4 is conserved across the Coronavirinae, 
this hairpin structure tolerates extensive mutations. For example, it was 
shown for MHV that base pairing in SL4a is not required for replication 
and also separate deletions of SL4a and SL4b were tolerated. By contrast, 
deletion of the entire SL4 and a 3-nt deletion immediately downstream 
of SL4 abolished or profoundly impaired viral RNA synthesis. Analysis of 
second-site mutations and experiments using a viable MHV mutant in 
which SL4 was replaced with a shorter SL with a heterologous sequence 
led to a model in which SL4 acts as a spacer element that controls the proper 
orientation of SL1, SL2, and TRS-L required for subgenomic RNA 
synthesis (Yang et al., 2011). The SL4 sequence overlaps with the 
“hotspot” of the 5’-proximal genomic acceptor required for BCoV
discontinuous transcription (Wu et al., 2006), thus further supporting a role
of the region immediately downstream of TRS-L in subgenomic RNA syn- 
thesis. Based on these observations, it is reasonable to think that the structural 
flexibility of SL4 may be required to establish transient long-range RNA-RNA 
interactions. In line with this idea, a previous TGEV reverse genetic study 
showed that mutants permitting additional base-pairing interactions of the 
copy TRS-B upstream of a reporter sgRNA with the 5’-GAAA-3’ sequence
immediately downstream of the TGEV TRS-L CS (5’-ACUAAAC-3’)
enhance the production of this particular reporter sgRNA (Zúñiga et al.,
2004). Based on the available functional data and structural analyses of 
alphacoronavirus 5’-terminal genome regions, it was proposed that the basal 
part of SL4 exists in a flexible state, thereby possibly facilitating strand transfer 
during sg minus-strand RNA synthesis (Zúñiga et al., 2004). In addition to the 
inherent SL4 structural flexibility, proteins known to bind to this region may 
additionally modulate the stability of the SL4 structure, a hypothesis that 
remains to be investigated in further experiments. Of particular interest in this 
context, heterogeneous nuclear ribonucleoprotein (hnRNP) family members 
and the viral N protein have been shown to bind to this region and there is 
evidence that the N protein has chaperone functions and TRS-L/TRS-B 
unwinding activities (Galan et al., 2009; Grossoehme et al., 2009; Huang 
and Lai, 1999; Keane et al., 2012; Li et al., 1997, 1999; Shi and Lai, 2005; 
Sola et al., 2011a,b; Zúñiga et al., 2007, 2010). It is therefore tempting to spec- 
ulate that cellular and/or viral proteins bind and unwind the energetically 
labile SL4 substructure to facilitate the strand transfer during sg minus-strand 
RNA synthesis. 


3.1.2.4 Stem-Loop 5 

A 5’-terminal SL designated earlier as SL-IV that extends into the nsp1 cod-
ing sequence was described as an RNA element required for optimal MHV
replication (Guan et al., 2011). The SHAPE analysis mentioned earlier 
suggests that SL5 contains three hairpin substructures, SL5a (previously 
designated as SL-IV), 5b, and 5c (Yang et al., 2015). Genus-wide analyses 
of 5’-terminal genome regions suggest a similar SL5 structure to be con-
served in alphacoronaviruses, which includes three substructures called
SL5a, 5b, and 5c (Chen and Olsthoorn, 2010; Madhugiri et al., 2014). In
both alpha- and betacoronaviruses, SL5 extends into ORF1a. Depending
on the lineage studied, conserved loop sequences could be identified in 
the hairpin substructures of SL5. This sequence conservation was more 
pronounced in alpha- than in betacoronaviruses. In alphacoronaviruses, 
each of the three hairpins (SL5a, 5b, and 5c) was found to contain a
5’-UUCCGU-3’ loop sequence (Madhugiri et al., 2014). Equivalent struc- 
tures in betacoronaviruses were only partly conserved, with significant 
lineage-specific variations being detectable in the substructural hairpins 
and their terminal loop sequences. A possible SL5 equivalent in gamma- 
coronaviruses was predicted to adopt a rod-like structure that lacks con- 
served loop sequences (Chen and Olsthoorn, 2010). 

As outlined earlier, possible betacoronavirus SL5 substructures located 
within (or extending into) the nsp1-coding region (previously termed SLs 
IV, V, VI, and VII) have been characterized structurally and functionally 
using BCoV DI RNA and MHV reverse genetics systems (Brown et al., 
2007; Guan et al., 2011, 2012; Raman and Brian, 2005). In a BCoV-based 
DI RNA system, SL5A (previously designated as SL-IV) was revealed to be a 
cis-acting element essential for DI RNA replication (Brown et al., 2007). In a
recent MHV reverse genetic study, nucleotide substitutions that disrupt 
SL5C while preserving the N-terminal nsp1 amino acid sequence resulted 
in the recovery of viable mutant viruses with only moderate impairment of 
virus replication compared to wild-type virus, implying that SL5C is dis-
pensable for viral replication (Yang et al., 2015), whereas earlier studies
suggested this region to be required for accumulation and replication of a 
BCoV-based DI RNA (Brown et al., 2007). The reasons for these contra- 
dictory results are not clear but may be linked to limitations of DI-based rep- 
lication assays in which even small functional defects may result in a 
complete loss of DI RNA replication. Similar observations were made in 
other cases. For example, DI RNAs and recombinant viruses containing 
identical mutations in the 5’- and 3’-UTRs led to quite different phenotypes
in some cases (Johnson et al., 2005; Yang et al., 2011), illustrating that 
reverse genetics systems based on full-length genomes are powerful and, 
in some cases, essential tools in functional studies of cis-acting elements. 


3.2 3’-Terminal cis-Acting RNA Elements


The first studies of 3’ cis-acting elements required for RNA replication were 
based on betacoronavirus DI RNA systems (Kim et al., 1993; Lin and Lai, 
1993; Luytjes et al., 1996). Coronavirus 3’ UTRs range in size from ~300 to 
~500 nts (excluding the 3’ poly(A) tail). Using MHV DI RNAs, the min- 
imal length of 3’-terminal sequence required for replication was determined 
to involve 436 nts, including the entire 301-nt 3’ UTR, part of the 
N protein-coding sequence and the poly(A) tail (Lin et al., 1996; Luytjes 
et al., 1996). In subsequent studies, the minimal 3’-terminal sequences
required for TGEV (492 nts) and IBV (338 nts) DI RNA replication were 
determined (Dalton et al., 2001; Mendez et al., 1996). In both viruses, the 
cis-acting signal required for RNA synthesis could be mapped to the 3’
UTR (only), while N protein-coding sequences were not required. Similar
observations were made for betacoronaviruses using recombinant MHV 
mutants. These studies demonstrated that the structural protein genes 
(including the N protein-coding region) tolerate substantial alterations 
including combinations of single-site mutations and rearrangements of 
entire genes, suggesting that the 3’-proximal coding regions are not part 
of the 3’ cis-acting elements (de Haan et al., 2002; Goebel et al., 2004b; 
Lorenz et al., 2011). Furthermore, studies by Enjuanes and coworkers 
suggested that the N gene was dispensable for replication of Alphacoronavirus 
1 using both TGEV and FCoV (Izeta et al., 1999). Also, deletions of the 
FCoV accessory protein genes 7a and 7b were shown to be tolerated, dem-
onstrating that the 3’ cis-acting replication signals of this virus involve only
283 nts plus the poly(A) tail (Haijema et al., 2004). For MHV, the minimal 3’-
terminal cis-acting signal required for negative-strand (but not plus-strand)
RNA synthesis was mapped to no more than 55 nts using a DI RNA-based 
system (Lin et al., 1994). Furthermore, a short poly(A) tract of at least 
5-10 nts was shown to be an essential cis-acting signal to support BCoV 
DI RNA replication (Spagnolo and Hogue, 2000). 


3.2.1 Structural Features of Coronavirus 3' cis-Acting Elements 

Also in this case, our knowledge of coronavirus 3’ cis-acting elements is largely 
based on studies using betacoronaviruses, such as MHV. A combination of 
bioinformatics, biochemical analyses, and functional studies was used to iden- 
tify and characterize cis-acting RNA elements in the 3’ UTR (Goebel et al., 
2004a, 2007; Hsue and Masters, 1997; Hsue et al., 2000; Liu et al., 2001, 2013; 
Stammler et al., 2011; Williams et al., 1999; Züst et al., 2008). More recently,
these studies were extended to alphacoronaviruses using genus-wide bioinfor- 
matics analyses. A combination of sequence and structural alignments of all 
currently recognized alphacoronavirus species was used to identify conserved 
RNA structures in the 3’-terminal genome region and the predicted structures 
were then confirmed and refined using structure probing data obtained for 
HCoV-229E and HCoV-NL63 (Madhugiri et al., 2014). Fig. 1 provides a 
simplistic representation of the 3’-proximal RNA structures identified in beta- 
and alphacoronaviruses. 

The 5’-most RNA structure in this region is a bulged stem-loop (BSL) of
68 nts. It is located immediately downstream of the N gene stop codon and 
was shown to be required for MHV DI RNA replication (Hsue and Masters, 
1997; Hsue et al., 2000). Despite limited sequence similarity in this genome 
region, the BSL structure is predicted to be conserved in betacoronaviruses 
(Goebel et al., 2004a; Hsue and Masters, 1997). A possible BSL equivalent 
was also identified in IBV and other gammacoronaviruses and its functional 
importance was supported using IBV DI RNA constructs (Dalton et al., 
2001). The nearly perfect SL structure in IBV comprises 42 nts and is located 
at the upstream end of region II, a conserved region in the gamma- 
coronavirus 3’ UTR. Recent structural and bioinformatics analyses suggest 
that alphacoronavirus 3’ UTRs do not contain a structural equivalent of the 
betacoronavirus BSL (Madhugiri et al., 2014). 

The second essential RNA structure positioned 3’ to the BSL is a classical 
hairpin-type pseudoknot (PK) structure, which was first identified in BCoV. 
This 54-nt RNA element was identified as a cis-acting element required for 
BCoV DI RNA replication (Williams et al., 1999). Also the 3’-terminal 
genome regions of other betacoronaviruses, such as HCoV-HKU1 (Woo 
et al., 2005) and SARS-CoV, were found to contain this PK structure 
(Goebel et al., 2004b). Other studies suggested that this PK structure was 
conserved in beta- and alphacoronaviruses while gammacoronaviruses 
retained only some of the PK features or lacked this structure entirely 
(Williams et al., 1999). An interesting structural property of the BSL and 
the PK is that the elements overlap by five nucleotides in the primary struc- 
ture. This implies that they cannot exist simultaneously, at least not 
completely, which led to a model in which the BSL and PK are part of a 
“molecular switch” that regulates viral RNA synthesis. Evidence to support 
this model was obtained in an extensive MHV mutagenesis study (Goebel 
et al., 2004a). 
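
The reasoning behind the mutual exclusivity of the BSL and the PK can be sketched in a few lines of Python: if two structures are written as lists of base pairs, any nucleotide that is paired to different partners in the two lists cannot satisfy both structures at once. The base-pair coordinates below are hypothetical toy values (not BCoV or MHV coordinates), chosen so that the two structures compete for five nucleotides, as described in the text.

```python
def conflicting_positions(pairs_a, pairs_b):
    """Return the sorted positions that are base-paired in both structures
    but to different partners, i.e. positions that prevent the two
    structures from forming simultaneously."""
    partner_a = {}
    for i, j in pairs_a:
        partner_a[i] = j
        partner_a[j] = i
    conflicts = set()
    for i, j in pairs_b:
        for pos, partner in ((i, j), (j, i)):
            if pos in partner_a and partner_a[pos] != partner:
                conflicts.add(pos)
    return sorted(conflicts)


# Hypothetical basal stem of an upstream stem-loop (positions 1-5 paired
# with positions 20-16).
bsl_base = [(1, 20), (2, 19), (3, 18), (4, 17), (5, 16)]

# Hypothetical pseudoknot stem 1 that uses the same 3' strand (16-20) but
# pairs it with a downstream loop sequence (40-36).
pk_stem1 = [(16, 40), (17, 39), (18, 38), (19, 37), (20, 36)]

print(conflicting_positions(bsl_base, pk_stem1))  # [16, 17, 18, 19, 20]
```

In this toy case, five positions are claimed by both structures, so formation of one stem requires at least partial opening of the other.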

A recent bioinformatics study revisited the conservation of RNA struc- 
tural elements in the betacoronavirus 3’ UTR, including the BSL and the 
two SL structures that form the PK. The predictions were in excellent agree- 
ment with previous studies (Goebel et al., 2004a) and confirmed that, in all 
established betacoronavirus species, the formation of the PK requires struc- 
tural rearrangements at the base of the BSL to permit the base-pairing inter- 
actions required to form PK stem 1, the latter involving the loop sequence of 
the PK-SL2 element and the BSL 3’-terminal sequence (Madhugiri et al.,
2014). Interestingly, this study also revealed another conserved structural 
element, a short hairpin, immediately upstream of the PK-SL2 and suggested 
that the formation of this hairpin may compete with base-pairing
interactions required to form the basal part of the BSL and the PK stem 1, 
respectively. Furthermore, this hairpin overlaps partly with the PK loop 1 
region that, in a previous study, was suggested to interact with the extreme 
3’ end of the MHV genome (Züst et al., 2008). The conservation of both
structure and sequence of this hairpin supports a biological function of this 
element. In this context, it may be worth mentioning that the hairpin struc- 
ture is predicted to be disrupted by the 6-nt insertion in loop 1 that, previ- 
ously, was reported to cause a poorly replicating and unstable phenotype in 
MHV (Goebel et al., 2004a). It remains to be seen if the small hairpin 
represents yet another element in the intricate network of base-pairing 
interactions between the BSL, the PK, and the 3’ end that together consti- 
tute the complex molecular switch proposed by the Masters laboratory 
(Goebel et al., 2004a). 

Our recent study using representative viruses from all currently recog- 
nized alphacoronavirus species identified a number of conserved RNA 
structural elements in the alphacoronavirus 3’ UTR (Madhugiri et al., 
2014). As described earlier, a counterpart of the betacoronavirus BSL 
structure (Goebel et al., 2004a; Hsue and Masters, 1997) could not be iden- 
tified in the alphacoronavirus 3’ UTR, while structural elements required to 
form a PK structure were identified in all alphacoronaviruses (Madhugiri 
et al., 2014). Intriguingly, despite the absence of an upstream BSL in 
alphacoronaviruses, the formation of this putative PK structure was 
predicted to require the disruption of a short hairpin immediately upstream 
of PK-SL2, a scenario that is similar to (but less complex than) that described 
for betacoronaviruses. Further studies are required to answer the question of 
whether or not alphacoronaviruses employ a molecular switch mechanism 
similar to that employed by betacoronaviruses (Goebel et al., 2004a). Fur- 
thermore, our structure probing analyses supported the predicted PK-SL2 
structure for both HCoV-229E and HCoV-NL63 (Madhugiri et al., 
2014). They also supported base-pairing interactions upstream of the 
HCoV-NL63 SL2, thus supporting the formation of the predicted small 
hairpin in this region, while we failed to obtain experimental support for this 
hairpin in HCoV-229E. Also, the structure probing data did not support the 
formation of a stable PK structure, possibly reflecting a low thermodynamic 
stability as previously reported for the equivalent PK in betacoronaviruses 
(Stammler et al., 2011). Further experiments including reverse genetics 
studies are required to confirm the existence and biological significance 
of the predicted alphacoronavirus PK structure. 

The 3’-most RNA secondary structure, a long multibranched SL struc-
ture downstream of the pseudoknot was predicted and further confirmed by 
biochemical probing (Liu et al., 2001). For MHV, several of the stems in this 
region were reported to be required for efficient DI RNA replication. Using 
an MHV reverse genetic approach, Masters and coworkers demonstrated 
that the long hypervariable BSL structure is dispensable for viral replication 
(Goebel et al., 2007). The study by Madhugiri et al. (2014) revealed the con- 
servation of this RNA structural element downstream of PK-SL2 in all 
betacoronaviruses and, as expected, confirmed the conservation of the 
octanucleotide sequence, 5’-GGAAGAGC-3’, that has been identified pre- 
viously in the 3’ UTR of most coronaviruses (Goebel et al., 2007). The 
octanucleotide sequence was confirmed to be part of a single-stranded 
region. As pointed out earlier, the role of this conserved element is currently 
unclear as both the HVR and the octanucleotide sequence appear to be dis- 
pensable for MHV replication in vitro (Goebel et al., 2007; Liu et al., 2001). 

With respect to the HVR downstream of PK-SL2, an extensive SL struc- 
ture was predicted in bioinformatics analyses of alphacoronavirus 3’ UTRs 
(Madhugiri et al., 2014). The structure is supported by a large number of 
covariant base pairs and contains the conserved octanucleotide sequence 
in a single-stranded region, which could be corroborated by structure prob- 
ing data obtained for HCoV-229E and HCoV-NL63. Of note, the cell 
culture-adapted HCoV-NL63 isolate used in our study for structure probing 
analysis contained a short deletion (apparently acquired during serial passag- 
ing in cell culture) that resulted in a smaller loop but retained the 
octanucleotide sequence (with one G-to-A replacement) in a position iden- 
tical to that predicted for HCoV-229E (Madhugiri et al., 2014). This seren- 
dipitous deletion shows that the distal part of the extended SL structure is 
dispensable for HCoV-NL63 replication in cell culture. The data also 
suggested that, despite the deletion, the octanucleotide sequence retains a 
position in the loop region of the SL structure and tolerates minimal 
changes, the latter being consistent with MHV reverse genetics data 
obtained for the HVR/octanucleotide region (Goebel et al., 2007). 


3.2.2 Functional Roles of Coronavirus 3'-Terminal cis-Acting Elements 
Possible functions of RNA elements residing in the 3’-proximal genome
regions have been studied most extensively in betacoronaviruses. Although 
the betacoronavirus 3’ UTRs have minimal sequence identity, the RNA 
structures conserved across different betacoronavirus lineages appear to be 
functionally equivalent as demonstrated in studies using viable chimeric 
viruses. Intriguingly, this functional conservation of 3’-proximal RNA
structures does not extend to alpha- and gammacoronaviruses because
replacements of the MHV 3’ UTR with that of TGEV and IBV, respec-
tively, did not give rise to viable MHV mutants (Goebel et al., 2004b).
The available evidence suggests that coronaviruses evolved several genus- 
specific cis-acting RNA elements. For example, the presence of a BSL 
followed by a PK structure is limited to betacoronaviruses, while other gen- 
era appear to contain only one of these elements, with the PK being con- 
served in alphacoronaviruses and the BSL in gammacoronaviruses (Dalton 
et al., 2001; Hsue and Masters, 1997; Williams et al., 1999). 


3.2.2.1 BSL and Pseudoknot 

The structures and several potentially important substructures of both the 
BSL and PK have been characterized in significant detail for BCoV and 
MHV (Goebel et al., 2004a; Hsue et al., 2000; Williams et al., 1999). As 
indicated earlier, the BSL and PK regions overlap by several nucleotides. 
Formation of the first stem of the PK structure requires base-pairing inter- 
actions with the downstream segment F of the BSL, thereby disrupting the 
basal part of this structure. In a comprehensive MHV mutagenesis study, the 
functional significance of both structures was demonstrated conclusively. 
Because the two structures cannot exist simultaneously and, yet, each of 
them is essential for viral replication, it was proposed that the two elements 
may adopt substructures that act as a “molecular switch” that controls the 
transition between different steps of the viral replication cycle (Goebel 
et al., 2004a). In a subsequent study, the proposed “molecular switch” 
was characterized in more detail and evidence was obtained to suggest a 
direct interaction between loop 1 of the PK with the extreme 3’ end of 
the MHV genome (Züst et al., 2008). The characterization of second-site
revertants arising from MHV mutants with genetically engineered insertions 
in loop 1 revealed distinct replacements at the extreme 3’ end, thereby 
retaining specific base-pairing interactions with the loop 1 region and thus 
precluding the formation of stem 1 of the PK. Other mutants were found to 
contain second-site replacements indicative of RNA:protein interactions 
between the PK region and nsp8 and nsp9. Based on these data, a model 
was proposed in which the formation and disruption of the PK by differen-
tial base-pairing interactions with the BSL and 3’-terminal genome
sequences, respectively, may lead to alternate substructures that govern dif-
ferent steps of the initiation and continuation of negative-strand RNA syn-
thesis (Züst et al., 2008). Further evidence to support this model was
obtained in a subsequent MHV reverse genetics study by Liu et al. (2013).
Thermodynamic investigations revealed a limited stability of the PK struc-
ture (Stammler et al., 2011). This structural flexibility is consistent with the
proposed role as a “molecular switch.” 


3.2.2.2 Hypervariable Region 

The region downstream of the PK is less conserved among betacoronaviruses. 
It is generally referred to as the “hypervariable region (HVR)” and is not iden- 
tical to the “HVR” identified at the 5’ end of the 3’ UTR in IBV (Dalton 
et al., 2001; Williams et al., 1993). The betacoronavirus HVR was predicted 
to contain a complex and functionally relevant RNA structure based on enzy- 
matic probing and MHV DI RNA mutagenesis data (Liu et al., 2001). By 
contrast, more recent studies showed that large parts or even the entire 
HVR region can be deleted without causing major defects in MHV replica- 
tion, arguing against an essential role of this genome region in viral replication 
(Goebel et al., 2007; Züst et al., 2008). However, some of the MHV HVR
mutants proved to be highly attenuated in vivo, suggesting a possible role in 
pathogenesis (Goebel et al., 2007). 

The conserved octanucleotide sequence mentioned earlier, 5’-GGAAGAGC-3’,
was identified in early coronavirus sequence analyses performed in the
1980s (Boursnell et al., 1985; Lapps et al., 1987; Schreiber
et al., 1989). Subsequent studies confirmed its universal conservation across 
all coronavirus genera, with only very few viruses containing single replace- 
ments in this sequence (Goebel et al., 2007). Obviously, this strict conser- 
vation suggests an important functional role which, however, could not be 
confirmed to date. As mentioned earlier, the entire HVR including the 
octanucleotide sequence can be deleted from the MHV genome without 
causing major defects in viral replication in vitro (Goebel et al., 2007). In 
line with this, replacements of single nucleotides within the octanucleotide 
motif were tolerated although most of these mutants exhibited small-plaque 
phenotypes and/or delayed single-step growth kinetics. In both high- and 
low-multiplicity-of-infection experiments, octanucleotide and HVR dele- 
tion mutants lagged behind the wild-type virus but reached near-wild-type 
titers at later time points and had no detectable defect in viral RNA synthesis 
(Goebel et al., 2007). 
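
As a simple illustration of how the conserved octanucleotide, including variants with a single replacement, might be located in a 3’ UTR sequence, the following Python sketch performs a sliding-window search that tolerates a user-defined number of mismatches. The function name and the toy sequences are hypothetical and do not represent actual coronavirus 3’ UTRs.

```python
def find_motif(seq, motif="GGAAGAGC", max_mismatches=1):
    """Return (position, matched_window, mismatches) for every window of
    seq that matches motif with at most max_mismatches substitutions.
    Positions are 0-based; DNA or RNA letters are accepted."""
    rna = seq.upper().replace("T", "U")
    target = motif.upper().replace("T", "U")
    hits = []
    for i in range(len(rna) - len(target) + 1):
        window = rna[i:i + len(target)]
        mismatches = sum(1 for x, y in zip(window, target) if x != y)
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Toy sequence carrying the exact octanucleotide.
print(find_motif("UUACGGAAGAGCCAUU"))
# [(4, 'GGAAGAGC', 0)]

# Toy sequence carrying a single G-to-A replacement within the motif.
print(find_motif("UUACGGAAAAGCCAUU"))
# [(4, 'GGAAAAGC', 1)]
```

Setting max_mismatches to 0 restricts the search to the strictly conserved sequence, whereas a value of 1 also reports single-replacement variants of the kind mentioned above.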


3.2.2.3 3’-Terminal Poly(A) Tail 
MHV and BCoV DI RNA studies showed that the poly(A) tail at the 3’ end 
of coronavirus genomes is essential, with a minimum of 5-10 adenylate
residues being required for DI RNA replication (Spagnolo and Hogue,
2000). This requirement corresponds well to the minimal binding site of
the poly(A)-binding protein (PABP) on DI RNA poly(A) sequences
(Spagnolo and Hogue, 2000). Recent studies further suggest that 3’ 
poly(A) tail lengths may vary between 30 and 65 nt in the course of viral 
replication in vitro (Wu et al., 2013) as was shown for both beta- and 
gammacoronavirus infections and in a range of cell types, both in vitro 
and in vivo (Shien et al., 2014). The biological significance of these obser- 
vations is currently unclear. 
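
For readers who wish to compare poly(A) tail lengths in their own sequence data, a minimal Python sketch is given below. It simply counts the uninterrupted run of adenylates at the 3’ end of a sequence and compares it with a threshold. The function names and toy sequences are hypothetical, and the default threshold of 5 is taken from the lower bound of the 5-10 residues reported above for DI RNA replication; it is an assumption for illustration, not a validated cutoff.

```python
def poly_a_tail_length(seq):
    """Count the uninterrupted run of A residues at the 3' end of seq."""
    length = 0
    for base in reversed(seq.upper()):
        if base != "A":
            break
        length += 1
    return length


def meets_minimum_tail(seq, minimum=5):
    """Return True if the 3'-terminal A run has at least 'minimum' residues.
    The default of 5 is an illustrative assumption (see lead-in)."""
    return poly_a_tail_length(seq) >= minimum


# Toy sequences: an arbitrary body followed by A runs of different lengths.
print(poly_a_tail_length("GGCUAGC" + "A" * 30))  # 30
print(poly_a_tail_length("GGCUAGCAAA"))          # 3
print(meets_minimum_tail("GGCUAGCAAA"))          # False
```

Note that tail lengths measured in sequencing data can be affected by the library preparation and sequencing method, which this sketch does not account for.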


4. RNA ELEMENTS INVOLVED IN CORONAVIRUS 
GENOME PACKAGING 


To selectively package their genome RNA (rather than other viral and 
cellular RNAs), viruses employ distinct cis-acting sequences in the viral 
genome RNA and trans-acting viral factors (Annamalai and Rao, 2006;
D’Souza and Summers, 2005; Nugent et al., 1999). Even though cor- 
onaviruses produce large amounts of subgenomic mRNAs in infected cells, 
these RNAs are not (or extremely inefficiently) incorporated into virus par- 
ticles (Escors et al., 2003), suggesting that coronaviruses have evolved spe- 
cific mechanisms to efficiently package their genome RNA into progeny 
virus particles. 

As for other coronavirus cis-acting RNA elements, the genomic pack-
aging signal (PS) was first discovered by DI RNA studies using MHV 
(Makino et al., 1990; van der Most et al., 1991). PSs of alpha- and 
gammacoronaviruses were first identified for TGEV and IBV (Escors 
et al., 2003; Penzes et al., 1994). MHV DI RNA studies revealed a 69-nt 
SL structure that was (i) located in the 3’ region of ORF1b, 
(ii) confirmed to be required for DI RNA packaging, and (iii) shown to 
interact with the viral N protein (Fosmire et al., 1992; Molenkamp and 
Spaan, 1997; Woo et al., 1997). Subsequent studies indicated that a larger 
PS element and, possibly, additional factors are required for optimal pack- 
aging efficiency (Bos et al., 1997; Cologna and Hogue, 2000; Narayanan and 
Makino, 2001). More recently, a PS that is conserved in lineage 
A betacoronaviruses and a novel 95-nt BSL were predicted and supported 
by chemical and enzymatic probing experiments (Chen et al., 2007). 
The conservation of this PS among lineage A coronaviruses is consistent 
with earlier observations that the BCoV PS is functionally replaceable with 
its MHV counterpart (Cologna and Hogue, 2000). Remarkably, this 
structurally and functionally conserved PS of lineage A betacoronaviruses is
not conserved in other lineages of betacoronaviruses and other coronavirus 
genera (Kuo and Masters, 2013), suggesting differential requirements for 
genome packaging among closely related coronaviruses. 

For TGEV, the PS was identified using genetically engineered DI RNAs 
(Izeta et al., 1999). Deletion analyses revealed a minimal TGEV PS required 
for efficient packaging. This PS contained nts 100-649 from the 5’-proximal 
genome region (Escors et al., 2003). Also for IBV, a DI RNA that was effi- 
ciently packaged has been isolated and characterized (Penzes et al., 1994), 
even though, in this case, the mapping of a possible PS produced inconclu- 
sive data (Dalton et al., 2001). The TGEV and IBV studies support the 
notion above that coronavirus PSs are found in different genome regions 
(Escors et al., 2003; Penzes et al., 1994) which, to some extent, is reminis- 
cent of the situation described for picornavirus cre elements (Steil and 
Barton, 2009). 

Further insight into the role of PS in genome RNA packaging into 
virions was obtained in a recent study using MHV (Kuo and Masters, 
2013). The study provides conclusive evidence that the PS (i) supports selec-
tive packaging of the viral genome RNA into virions and (ii) remains func-
tional when transposed to an ectopic genomic site. Surprisingly, this study
also revealed that the PS is not essential for MHV viability and viral growth 
in cell culture, suggesting that the principal role of the MHV PS is to ensure 
selective packaging of viral genome RNA into virions. Further insight into 
conserved and distinct properties of coronavirus PSs can be expected from
future studies using viruses representing all established coronavirus genera. 


5. POSSIBLE ROLES OF CELLULAR PROTEINS 
IN CORONAVIRUS REPLICATION 


A number of studies have addressed possible roles of cellular
proteins in coronavirus (mainly MHV) RNA synthesis. In these studies,
several members of the hnRNP family (PTB or hnRNP A1, SYNCRIP) 
were identified based on their ability to bind to viral RNA fragments con- 
taining TRS (TRS-L as well as TRS-B) in vitro and, in some cases, to 
affect MHV replication (Choi et al., 2004; Furuya and Lai, 1993; Li 
et al., 1997; Zhang and Lai, 1995). Deletion analysis and site-directed 
mutagenesis of the binding regions of PTB or hnRNP A1 further demon- 
strated significant inhibition of RNA transcription (Li et al., 1999). Fur- 
thermore, the functional importance of hnRNP in coronavirus RNA 
replication was demonstrated by overexpressing wild-type hnRNP A1 or a
dominant-negative form of hnRNP A1 in cells (Shi et al., 2000, 2003). It 
was also shown that hnRNP A1 interacts with the viral N protein (both 
in vitro and in vivo), suggesting that this protein may become part of 
the RTC (Shi et al., 2000; Wang and Zhang, 1999). Members of the 
hnRNP family (hnRNP A1 and PTB) were shown to bind to 5’ UTR 
and 3’ UTR and were suggested to mediate cross talk between the
5’- and 3’-terminal genome regions (Huang and Lai, 1999, 2001). Further-
more, it was reported that interactions of hnRNP A1 and PTB modulate
viral RNA synthesis and SYNCRIP silencing leads to reduced virus pro- 
duction (Choi et al., 2004). Similar observations were made for TGEV. It 
was shown that PTB binds to the TGEV TRS-L sequence while other 
hnRNP family members were found to bind to the 3’ end of the genome 
(hnRNP Q, hnRNP A2B1, and hnRNP A0) (Galan et al., 2009; Sola
et al., 2011b). Furthermore, silencing of hnRNP Q expression showed 
a significant reduction in TGEV RNA synthesis and virus production, 
supporting biologically relevant functions of hnRNP family members in 
coronavirus RNA synthesis (Galan et al., 2009). Host factors that interact 
with specific 5’ cis-acting structures have only been described for BCoV 
(Raman and Brian, 2005). These host factors were revealed to bind to 
SL 5a (previously designated as SL-IV). 

Proteins that specifically interact with coronavirus 3’ UTRs have mainly 
been identified by UV-crosslinking experiments using MHV, BCoV, and 
TGEV terminal genome sequences (Sola et al., 2011b). Members of the 
hnRNP family and several other proteins were shown to interact with 
the 3’ UTR and have been suggested to have a role in negative-strand as 
well as positive-strand (genomic and subgenomic) RNA synthesis (Sola 
et al., 2011b). For BCoV, immunoprecipitation experiments revealed sev- 
eral proteins, including PABP, to interact with the poly(A) tail, another 
important cis-acting element for coronavirus replication (Spagnolo and 
Hogue, 2000). As mentioned earlier, several proteins, including PABP, 
could be enriched by RNA affinity chromatography using the TGEV 3’
UTR (Galan et al., 2009). Silencing of PABP, hnRNP Q, and glutamyl-
prolyl-tRNA synthetase expression led to a two- to threefold reduction
in viral RNA synthesis, suggesting that host factors that specifically interact 
with viral cis-acting elements may affect (or even be essential for) viral RNA 
replication. Clearly, the possible functions of these cellular factors deserve 
further investigation. 

In addition to their interactions with cis-acting RNA elements, cellular
proteins were found to interact with specific coronavirus nsps. For example, 
the purification and characterization of enzymatically active SARS-CoV 
RTCs showed that cellular factors may enhance viral RdRp activity (van 
Hemert et al., 2008). Also, cellular DEAD-box-family helicases, such as 
DDX5 and DDX1, have been implicated in coronavirus RNA synthesis. 
Specific interactions of the DDX5 protein with the SARS-CoV helicase, 
nsp13, were confirmed in yeast and mammalian two-hybrid and co- 
immunoprecipitation experiments. Silencing of DDX5 expression led to 
reduced viral RNA replication and virus titers, supporting the biological sig- 
nificance of this interaction (Chen et al., 2009a). Similarly, in IBV and 
SARS-CoV, interactions between DDX1 and nsp14 were identified by 
yeast two-hybrid and coimmunoprecipitation assays (Xu et al., 2010) and 
validated by showing that knockdown of DDX1 expression affects corona- 
virus RNA replication and transcription. Similar conclusions were drawn 
from TGEV TRS interaction studies. Also in this context, the DDX1 
helicase was suggested to have a role in coronavirus replication (Sola 
et al., 2011b). 


6. CONCLUSIONS AND OUTLOOK 


Over the past years, a large number of studies using structural, bio- 
chemical, and reverse genetics approaches have provided important new 
insight into cis-acting elements that drive and control coronavirus RNA rep- 
lication (reviewed in Masters, 2007; Sola et al., 2011b). In many cases, these 
studies used betacoronaviruses, while alpha- and gammacoronaviruses were 
studied to a lesser extent and there is essentially no information on del- 
tacoronaviruses. This work also identified a growing number of cellular 
and viral proteins that bind to these structures and may have functions in 
genomic and/or subgenomic RNA synthesis, genome packaging, genome 
expression, or intracellular targeting of factors/structures engaged in viral 
RNA synthesis (reviewed in Narayanan and Makino, 2007; Sola et al., 
2011b). 

Recent bioinformatic studies suggest that the RNA secondary structure 
elements identified to date for only a small number of coronaviruses 
may be significantly more conserved than previously thought, both within 
and across the four coronavirus genera (Chen and Olsthoorn, 2010; 
Madhugiri et al., 2014; Yang et al., 2015). These studies also provide 
evidence to suggest a coevolution of RNA structures in the terminal 
genome regions with the viral replication machinery. Consistent with 
this hypothesis, the level of conservation of 5'- and 3'-terminal cis-active 
RNA elements among different coronavirus genera and lineages was found 
to be largely consistent with the replicase gene-based classification of 
the Coronavirinae (de Groot et al., 2012a; Madhugiri et al., 2014). The most 
conserved elements identified to date include SL 1, 2, and 4 (possibly, also 
SL 5) in the 5’ genome region and a putative PK in the 3'-UTR. The pre- 
cise roles of these structures and the viral and cellular proteins that 
bind these structures to perform specific steps in viral RNA synthesis 
remain to be investigated in more detail. Another interesting aspect to 
be explored in future studies is the possible role of the coronavirus 
3'-UTR in specific virus-host interactions and/or pathogenesis 
(Goebel et al., 2007). 